feat: support aten.resize_ converter #2874
Conversation
Overall looks good, just left some comments for your reference.
Although the additional elements are unpredictable, they are close enough to zero, as you said, so I think this can pass our checkpoints since the tolerance is acceptable.
@apbose @zewenli98 However, while investigating the cause of the CI/CD failure, I found that tensors in uninitialized locations sometimes contain extremely large values, whereas the description always showed values very close to zero.
Use the `run_tests` custom compare and check the original elements and the shape to verify correctness.
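A custom comparator along those lines could check only the shape and the elements carried over from the original tensor, since the trailing elements are uninitialized in eager PyTorch. This is a hedged sketch, not the actual harness code; the factory name, signature, and tolerances are assumptions:

```python
import torch

def make_resize_comparator(original_numel: int,
                           rtol: float = 1e-3,
                           atol: float = 1e-5):
    """Hypothetical comparator factory: verify the output shape matches,
    then compare only the first `original_numel` elements, because
    anything beyond that is uninitialized memory in eager resize_."""
    def compare(output_trt: torch.Tensor, output_cpu: torch.Tensor) -> bool:
        if output_trt.shape != output_cpu.shape:
            return False
        n = min(original_numel, output_trt.numel())
        return torch.allclose(
            output_trt.flatten()[:n].cpu(),
            output_cpu.flatten()[:n].cpu(),
            rtol=rtol, atol=atol,
        )
    return compare
```

Such a comparator could then be passed to the test harness as a `(comp_func, args)` pair, matching the convention in the snippet below.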
@chohk88 I was talking about the function.
The branch was force-pushed from eaecfb2 to 8f6805a.
```python
comp_func = comparator[0]
args = comparator[1]
self.assertTrue(comp_func(output_trt, output_cpu, *args))
```
Is there a specific case where `len(cuda_inputs) == 1` is required? In general I assume `len(cuda_inputs)` would be 1 in most cases. And since it is conditioned on `res_trt`, could you highlight the cases where it would be required?
Thank you for your comment!

As you mentioned, in most cases the length of `cuda_inputs` is 1. Below are the outputs when I set a breakpoint in the original code for two cases where `torch.ops.aten.resize_.default(x, target_shape)` returns tensors with shapes `(3,)` and `(10, 15, 10)`.

In both cases, the length of `cuda_inputs` is also 1. When `cuda_inputs` has a length of 1, `res_trt` and `res_cpu` are not lists of length 1 but `torch.Tensor`s with shapes `(3,)` and `(10, 15, 10)`, respectively. Therefore, when we use `zip` in `for output_trt, output_cpu, comparator in zip(res_trt, res_cpu, comparators)` and the comparators list has a length of 1, it compares fewer elements because one dimension is lost from `res_trt` and `res_cpu`.
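The truncation described above can be reproduced with plain `zip`: iterating over a bare tensor yields slices along its first dimension rather than the tensor itself, so a single dimension is silently lost. A minimal sketch with made-up values:

```python
import torch

# A single output tensor, not wrapped in a list.
res_trt = torch.zeros(10, 15, 10)
res_cpu = torch.zeros(10, 15, 10)
comparators = [(torch.allclose, ())]

# Zipping bare tensors iterates over their first dimension; with one
# comparator we get exactly one pair, and each element is only a
# (15, 10) slice -- one dimension has been lost.
pairs = list(zip(res_trt, res_cpu, comparators))
print(len(pairs))            # 1
print(pairs[0][0].shape)     # torch.Size([15, 10])

# Wrapping the outputs in lists preserves the full tensors.
pairs_ok = list(zip([res_trt], [res_cpu], comparators))
print(pairs_ok[0][0].shape)  # torch.Size([10, 15, 10])
```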
The branch was force-pushed from 4c80772 to 6bcadc5.
Description

Support converter for the `aten.resize_` operation: https://pytorch.org/docs/stable/generated/torch.Tensor.resize_.html#torch.Tensor.resize_

One critical aspect of this operation is handling cases where the target size (output size) is larger than the input tensor size. When the target size is larger than the input tensor size, the values of the additional elements are unpredictable. In PyTorch, these additional elements are not initialized, which can result in values that are close to zero but are not guaranteed to be zero.

In the converter developed for this PR, the additional elements are initialized to zero using `numpy.zeros`. While this approach ensures a predictable output, it does not exactly replicate the behavior of PyTorch, where the values of the additional elements are not initialized and can be unpredictable.

Fixes # (issue)
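The zero-initialization strategy described above can be sketched in plain numpy (a hypothetical helper for illustration, not the actual converter code): flatten the input, copy it into a zero buffer of the target size, and reshape.

```python
import numpy as np

def resize_with_zeros(x: np.ndarray, target_shape: tuple) -> np.ndarray:
    """Sketch of the converter's strategy: elements carried over from the
    input keep their values; extra elements become 0 (PyTorch's eager
    resize_ would leave them uninitialized instead)."""
    target_numel = int(np.prod(target_shape))
    flat = x.flatten()[:target_numel]            # truncate if shrinking
    out = np.zeros(target_numel, dtype=x.dtype)  # predictable zero padding
    out[: flat.size] = flat
    return out.reshape(target_shape)

x = np.arange(6, dtype=np.float32).reshape(2, 3)
print(resize_with_zeros(x, (3,)))    # first 3 original values
print(resize_with_zeros(x, (2, 4)))  # all 6 original values, then zeros
```

Because the extra elements are deterministic zeros here, a shrinking resize matches eager PyTorch exactly, while a growing resize only matches on the elements that came from the input.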
Type of change
Checklist: